Active learning for misspecified generalized linear models
Author
Abstract
Active learning refers to algorithmic frameworks aimed at selecting training data points in order to reduce the number of required training data points and/or improve the generalization performance of a learning method. In this paper, we present an asymptotic analysis of active learning for generalized linear models. Our analysis holds under the common practical situation of model misspecification, and is based on realistic assumptions regarding the nature of the sampling distributions, which are usually neither independent nor identical. We derive unbiased estimators of generalization performance, as well as estimators of expected reduction in generalization error after adding a new training data point, that allow us to optimize its sampling distribution through a convex optimization problem. Our analysis naturally leads to an algorithm for sequential active learning which is applicable for all tasks supported by generalized linear models (e.g., binary classification, multi-class classification, regression) and can be applied in non-linear settings through the use of Mercer kernels.
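To illustrate the kind of computation the abstract alludes to, here is a minimal Python sketch of importance-weighted fitting for a logistic-regression GLM when training points are drawn from a non-uniform sampling distribution over a finite pool. It is not the paper's estimator or algorithm. The function names (`sample_with_weights`, `fit_weighted_logistic`, `weighted_log_loss`), the finite-pool setup, the independent draws from `q`, the gradient-descent solver, and the L2 penalty are all assumptions made for illustration.

```python
import numpy as np


def sample_with_weights(pool_size, q, n_samples, rng):
    """Draw pool indices with probabilities q (assumed i.i.d. draws).

    Returns the indices and importance weights 1 / (pool_size * q_i),
    which correct for sampling from q instead of uniformly over the pool.
    """
    idx = rng.choice(pool_size, size=n_samples, replace=True, p=q)
    weights = 1.0 / (pool_size * q[idx])
    return idx, weights


def fit_weighted_logistic(X, y, weights, n_iter=500, lr=0.5, l2=1e-3):
    """Fit logistic regression by gradient descent on the weighted log-loss.

    y is expected in {0, 1}. The solver and the L2 penalty are
    illustrative choices, not taken from the paper.
    """
    theta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(X @ theta)))
        grad = X.T @ (weights * (p - y)) / len(y) + l2 * theta
        theta -= lr * grad
    return theta


def weighted_log_loss(X, y, weights, theta):
    """Importance-weighted average logistic loss for labels in {0, 1}."""
    z = X @ theta
    losses = np.logaddexp(0.0, z) - y * z  # log(1 + exp(z)) - y * z
    return float(np.mean(weights * losses))
```

With independent draws from `q`, the weighted average loss is an unbiased estimate of the average loss over the whole pool for a fixed parameter vector, which is why the weights matter when the model is misspecified. The sequential, non-independent sampling analyzed in the paper is beyond this sketch.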
Similar resources
Active Learning for Misspecified Models
Active learning is the problem in supervised learning to design the locations of training input points so that the generalization error is minimized. Existing active learning methods often assume that the model used for learning is correctly specified, i.e., the learning target function can be expressed by the model at hand. In many practical situations, however, this assumption may not be fulf...
Empirical best linear unbiased prediction in misspecified and improved panel data models with an application to gasoline demand
Misspecifications in econometric models can result in misestimated coefficients. An improved method for specifying econometric models is presented. The mean square error of an empirical best linear unbiased predictor of an individual drawing for the dependent variable of an improved model is derived. These ideas are illustrated using certain misspecified and improved models of the demand for gas...
Misspecified Linear Bandits
We consider the problem of online learning in misspecified linear stochastic multi-armed bandit problems. Regret guarantees for state-of-the-art linear bandit algorithms such as Optimism in the Face of Uncertainty Linear bandit (OFUL) hold under the assumption that the arms' expected rewards are perfectly linear in their features. It is, however, of interest to investigate the impact of potentia...
Dust source mapping using satellite imagery and machine learning models
Predicting dust source areas and determining the factors that affect them is necessary to prioritize management and practices that deal with desertification due to wind erosion in arid areas. Therefore, this study aimed to evaluate the application of three machine learning models (generalized linear model, artificial neural network, and random forest) to predict the vulnerability of dust cent...
Robust prediction and extrapolation designs for misspecified generalized linear regression models
We study minimax robust designs for response prediction and extrapolation in biased linear regression models. We extend previous work of others by considering a nonlinear fitted regression response, by taking a rather general extrapolation space and, most significantly, by dropping all restrictions on the structure of the regressors. Several examples are discussed. © 2007 Elsevier B.V. All righ...
Journal title:
Volume, issue:
Pages: -
Publication date: 2006